Distributed data clustering in multi-dimensional peer-to-peer networks

Authors

  • Stefano Lodi
  • Gianluca Moro
  • Claudio Sartori
Abstract

Several algorithms have recently been developed for distributed data clustering, which applies when data cannot be concentrated on a single machine, for instance because of privacy concerns, network bandwidth limitations, or the sheer volume of distributed data. Deployed and research peer-to-peer systems have proven able to manage very large databases made up of thousands of personal computers, making them a concrete solution for the forthcoming distributed database systems to be used in large grid computing networks and in clustering database management systems. Current distributed data clustering algorithms cannot be applied to such networks because they expect data to be organized as in traditional distributed database management systems, where the distribution of the relational schema is planned a priori in the design phase. In this paper we describe methods to cluster distributed data across peer-to-peer networks without requiring any costly reorganization of data, which would be infeasible in such large and dynamic overlay networks, and without reducing their performance in message routing and query processing. We compare the data clustering quality and efficiency of three multi-dimensional peer-to-peer systems according to two well-known clustering techniques.

Similar resources

SDC: A Distributed Clustering Protocol for Peer-to-Peer Networks

Network clustering can facilitate data discovery and peer lookup in peer-to-peer systems. In this paper, we design a distributed network clustering protocol, called SCM-based Distributed Clustering (SDC), for peer-to-peer networks. In this protocol, clustering is dynamically adjusted based on the Scaled Coverage Measure (SCM), a practical clustering accuracy measure. By exchanging messages with neig...

Distributed Data Clustering in Peer-to-Peer Networks: A Technical Review

Clustering, as one of the main branches of data mining, has gained an important place in various applied fields. Meanwhile, Peer-to-Peer (P2P) networks, with features such as simplicity, low-cost communication, and highly available resources, have gained worldwide popularity. In a P2P network, high volumes of data are distributed among dispersed data sources. ...

Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks

This paper proposes a scalable, local privacy-preserving algorithm for distributed peer-to-peer (P2P) data aggregation useful for many advanced data mining/analysis tasks such as average/sum computation, decision tree induction, feature selection, and more. Unlike most multi-party privacy-preserving data mining algorithms, this approach works in an asynchronous manner through local interactions...

P2P Network Trust Management Survey

Peer-to-peer (P2P) applications are no longer limited to home users and are starting to be accepted in academic and corporate environments. While file sharing and instant messaging applications are the most traditional examples, they are no longer the only ones benefiting from the potential advantages of P2P networks. For example, network file storage, data transmission, distributed computing, and co...

DisTriB: Distributed Trust Management Model Based on Gossip Learning and Bayesian Networks in Collaborative Computing Systems

The interactions among peers in Peer-to-Peer systems, as distributed collaborative systems, are based on asynchronous and unreliable communications. Trust is an essential and facilitating component in these interactions, especially in such uncertain environments. Various attacks that affect trust are possible due to the large-scale nature and openness of these systems. Peers do not have enough inform...

Publication date: 2010